Identifying the L1 of non-native writers: the CMU-Haifa system
Abstract
We show that it is possible to learn to identify, with high accuracy, the native language of English test takers from the content of the essays they write. Our method uses standard text classification techniques based on multiclass logistic regression, combining individually weak indicators to predict the most probable native language from a set of 11 possibilities. We describe the various features used for classification, as well as the settings of the classifier that yielded the highest accuracy.
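The classification step described in the abstract can be illustrated with a minimal sketch of multiclass (softmax) logistic regression over bag-of-words features. This is not the CMU-Haifa system: the feature set, the training sentences, the three placeholder labels (standing in for the 11 native languages), and the hyperparameters `epochs` and `lr` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy, invented training data. The three placeholder classes stand in for the
# 11 native languages; the sentences are illustrative, not from any corpus.
LABELS = ["A", "B", "C"]
EXAMPLES = [
    ("i am agree with this opinion", "A"),
    ("i am agree that it is good", "A"),
    ("he gave me many informations", "B"),
    ("the informations were useful", "B"),
    ("we discussed about the plan", "C"),
    ("they discussed about it yesterday", "C"),
]


def features(text):
    """Word counts plus a bias term (a hypothetical, deliberately tiny feature set)."""
    counts = Counter(text.lower().split())
    counts["<bias>"] = 1
    return counts


def predict_proba(weights, feats):
    """Softmax over one linear score per label; each feature is a weak indicator."""
    scores = {
        label: sum(w.get(f, 0.0) * v for f, v in feats.items())
        for label, w in weights.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def train(examples, labels, epochs=100, lr=0.1):
    """Stochastic gradient ascent on the log-likelihood (no regularization)."""
    weights = {label: {} for label in labels}
    for _ in range(epochs):
        for text, gold in examples:
            feats = features(text)
            probs = predict_proba(weights, feats)
            for label in labels:
                # Gradient of the log-likelihood: indicator minus predicted probability.
                grad = (1.0 if label == gold else 0.0) - probs[label]
                for f, v in feats.items():
                    weights[label][f] = weights[label].get(f, 0.0) + lr * grad * v
    return weights


def predict(weights, text):
    """Return the most probable label for a text."""
    probs = predict_proba(weights, features(text))
    return max(probs, key=probs.get)


weights = train(EXAMPLES, LABELS)
print(predict(weights, "she is agree with me"))
```

No single word decides the outcome here; the prediction comes from summing many small per-feature weights, which is the sense in which individually weak indicators are combined. A realistic setup would add regularization and held-out evaluation, neither of which is shown.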
Similar resources
Move-based investigation of appraisal in the introduction section of Applied Linguistics research articles: Similarities and differences between L1 and L2 English texts
Recent research has shown that academic writing is not ‘author-evacuated’ but, rather, carries a representation of the writers’ identity. One way through which writers project their identity in academic writing is stance-taking toward propositions advanced in the text. Appropriate stance-taking has proved to be challenging for novice writers of Research Articles (RAs), especially those writing ...
A Comparative Analysis of Self-Mentions in Applied Linguistics PhD Dissertations Written by Native and Non-Native English Writers
The purpose of the present study was to compare the PhD dissertations written by native and non-native English writers in the field of Applied Linguistics with regard to the use of self-mentions. To this end, 40 Applied Linguistics PhD dissertations (20 written by native English writers and 20 by non-native English writers) were selected randomly from among academic texts written in 2007-2017. The p...
Metadiscourse Elements in English Research Articles Written by Native English and Non-native Iranian Writers in Applied Linguistics and Civil Engineering
This study investigated metadiscourse and its subcategories in English research articles (RAs) written by nonnative (Iranian) and native English writers from the two disciplines of applied linguistics and civil engineering. The study aimed at seeing whether language and discipline influenced the frequency of occurrence of metadiscourse elements in research articles. To this end, a sample of 120...
Clause Complexity in Applied Linguistics Research Article Abstracts by Native and Non-Native English Writers: Taxis, Expansion and Projection
Halliday’s Systemic Functional Linguistics (SFL) has stood the test of time as a model of text analysis. The literature contains a plethora of studies that, while taking the ‘clause’ as the unit of analysis, have investigated the metafunctions in research articles of a single field of study or of various fields in comparison. Although ‘clause complex’ is another unit of SF a...
Hedges and Boosters in Academic Writing: Native vs. Non-Native Research Articles in Applied Linguistics and Engineering
The expression of doubt and certainty is crucial in academic writing where the authors have to distinguish opinion from fact and evaluate their assertions in acceptable and persuasive ways. Hedges and boosters are two strategies used for this purpose. Despite their importance in academic writing, we know little about how they are used in different disciplines and genres and how foreign language...
Publication date: 2013